Abstract. We prove the effective version of Birkhoff's ergodic theorem for
Martin-Löf random points and effectively open sets, improving the results
previously obtained in this direction (in particular those of V. Vyugin,
Nandakumar and Hoyrup, Rojas). The proof consists of two steps. First, we prove
a generalization of Kučera's theorem, which is a particular case of effective
ergodic theorem: a trajectory of a computable ergodic mapping that starts from
a random point cannot remain inside an effectively open set of measure less
than 1. Second, we show that the full statement of the effective ergodic
theorem can be reduced to this special case. Both steps use the statement of
classical ergodic theorem but not its usual classical proof. Therefore, we get
a new simple proof of the effective ergodic theorem (with weaker assumptions
than before).
Miller and Ng.

1 Introduction

The classical setting for the ergodic theorem is as follows. Let $X$ be a space
with a probability measure $\mu$ on it, and let $T\colon X \to X$ be a
measure-preserving transformation. Let $f$ be a real-valued integrable function
on $X$. Birkhoff's ergodic theorem (see for example [Shi96]) says that the
average value

$$\frac{f(x) + f(T(x)) + f(T(T(x))) + \ldots + f(T^{(n-1)}(x))}{n}$$

has a limit (as $n \to \infty$) for all $x$ except for some null set, and this
limit (the "time average") equals the "space average" $\int f(x)\,d\mu(x)$ if
the transformation $T$ is ergodic (i.e., has no non-trivial invariant subsets).

The classical example of an ergodic transformation is the left shift on Cantor
space $\Omega$ (the set of infinite binary sequences, also denoted by
$2^{\mathbb{N}}$ or $2^{\omega}$):

$$\sigma(\omega(0)\omega(1)\omega(2)\ldots) = \omega(1)\omega(2)\omega(3)\ldots$$

The left shift preserves Lebesgue measure (a.k.a. uniform measure) $\mu$ on
$\Omega$ and is ergodic. Therefore, the time and space averages coincide for
almost every starting point $\omega$. For a special case where $f$ is an
indicator function of some (measurable) set $A$, we conclude that almost surely
(for all $\omega$ outside some null set) the fraction of terms in the sequence

$$\omega, \sigma(\omega), \sigma(\sigma(\omega)), \ldots$$

that are inside $A$ converges to the measure of $A$.
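As an informal numerical illustration of this special case (not part of the argument), the following Python sketch estimates the fraction of tails that fall into a cylinder set. The names `tail_frequency` and `pattern` are hypothetical, and the pseudo-random bits produced by Python's `random` module are only a stand-in for a genuinely random sequence.

```python
import random

def tail_frequency(bits, pattern):
    """Fraction of the tails of `bits` that start with `pattern`.

    A tail starting with `pattern` means the shifted sequence lies in the
    cylinder set A = pattern * Omega, whose uniform measure is 2**-len(pattern).
    Only tails long enough to be tested against the pattern are counted.
    """
    m = len(pattern)
    n = len(bits) - m + 1
    hits = sum(1 for i in range(n) if bits.startswith(pattern, i))
    return hits / n

random.seed(0)  # fixed seed so that the run is reproducible
bits = "".join(random.choice("01") for _ in range(200_000))
pattern = "01"  # A = set of sequences starting with 01, measure 1/4
freq = tail_frequency(bits, pattern)
print(freq)
```

For a long pseudo-random prefix the printed frequency should be close to 1/4, the measure of the cylinder, in line with the classical theorem.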

It is natural to ask whether Birkhoff's ergodic theorem has an effective version
for individual points saying that for a Martin-Löf random starting point the
time average coincides with the space average (under some effectivity
assumptions for the space and the transformation). This question was posed by
van Lambalgen [vL87] and answered by Vyugin [Vyu97] who proved this statement
for the case of a computable function $f$ (he also proved the convergence result
for non-ergodic transformations). The result was later extended to larger
classes of functions [Nan08,HR09b]. However, we cannot directly apply these
results to an indicator function of an effectively open set (recall that an
open set $U$ is effectively open if there is a computably enumerable set $S$ of
finite strings such that $U$ consists exactly of the infinite sequences having
a prefix in $S$). Indeed, the characteristic function of such a set is not
computable (it is only lower semicomputable, i.e., it is the limit of a
non-decreasing sequence of computable functions). So for effectively open sets
(and lower semicomputable functions) the question remained open.⁶

In this paper we answer this question and show that effective ergodic theo- 
rem remains true for effectively open sets and lower semicomputable functions 
(Section 3). The proof goes in several steps. 

First, in Section 2 we consider the following corollary of Birkhoff's ergodic
theorem: if $A$ has positive measure, for almost every starting point at least
one element of the trajectory belongs to $A$. Switching to complements: if $A$
has measure less than 1, then (almost surely) some points in the trajectory are
outside $A$. An effective version of this statement (for effectively open sets
of measure less than 1 and left shifts in Cantor space) was proved by Kučera
[Kuc85]. We reproduce Kučera's proof and prove several similar statements.
(Most of them are consequences of the general results of Section 3, so the
direct proofs are redundant, but they are nice and simple.)

Then in Section 3 we consider the general effective ergodic theorem. In
Section 3.1 we prove a general version of Kučera's theorem for computable
ergodic transformations of Cantor space. Then (Section 3.2) we show how the
effective version of ergodic theorem for effectively open sets and lower
semicomputable functions can be reduced to classical Birkhoff's ergodic theorem
and the general version of Kučera's theorem proved in Section 3.1. Finally, we
outline the generalization of these results to other probability spaces
(Section 3.3).

⁶ It was proved in [HR09b] that the result holds for any effectively open set whose

In Section 4 we use the results of Section 2 to provide a generalized version 
of van Lambalgen's theorem (generalizing an earlier result of Miyabe) . 

The results of Sections 2 and 4 were presented at the Computability in Europe 
conference (and published in its proceedings [BDMS10]). The improvement in 
this paper is Theorem 8, showing that one can go further and reduce the general 
effective version of Birkhoff's ergodic theorem for effectively open sets to this 
special case. This last result was obtained independently in [FGMN]. 

2 Variations of Kučera's theorem

In this section, we prove several variants of Kučera's theorem. Let us first
recall the original version proved in [Kuc85]. Let $\sigma$ be the left shift
in Cantor space (i.e., an ergodic transformation of this space equipped with
uniform measure).

Theorem 1. If $A$ is an effectively open subset of the Cantor space of measure
less than 1, then for every Martin-Löf random sequence $\omega$ at least one of
its tails $\omega, \sigma(\omega), \sigma(\sigma(\omega)), \ldots$ does not
belong to $A$.

Recalling the definition of Martin-Löf randomness (a sequence is random if
it is outside any effectively null set) we can reformulate Kučera's theorem as
follows:

Let $A$ be an effectively open set of measure less than 1. Consider the set
$A^*$ of all sequences $\omega$ such that every tail $\sigma^{(n)}(\omega)$
belongs to $A$. Then $A^*$ is an effectively null set.

Before presenting the proof, let us mention an interpretation of this result.
Recall that the universal Martin-Löf test is a computable sequence
$U_1, U_2, \ldots$ of effectively open sets such that $\mu(U_i) \le 1/2^i$ and
the intersection $\bigcap_i U_i$ is the maximal effectively null set, i.e., the
set of all non-random sequences. Kučera's theorem shows that randomness can be
(in a paradoxical way) characterized by $U_1$ alone: a sequence is non-random if
and only if all its tails belong to $U_1$. (In one direction it is Kučera's
theorem, in the other direction we need to note that a tail of a non-random
sequence is non-random.)

Proof (of Kučera's theorem). We start with the following observation: it is
enough to show that for every interval $I$, we can uniformly construct an
effectively open set $J \subset I$ that contains $I \cap A^*$ and such that
$\mu(J) \le r\mu(I)$ for some fixed $r < 1$ (here we call an interval any set of
type $x\Omega$, where $x$ is some finite string, i.e., the set of infinite
binary sequences that start with $x$). Then we represent the effectively open
set $A$ of measure $r < 1$ as a union of disjoint intervals $I_1, I_2, \ldots$,
construct the sets $J_i$ for every $I_i$ and note that the union $A_1$ of all
$J_i$ is an effectively open set that contains $A^*$ and has measure $r^2$ or
less. Splitting $A_1$ into disjoint intervals and repeating this argument, we
get a set $A_2$ of measure at most $r^3$, etc. In this way we get an effectively
open cover for $A^*$ of arbitrarily small measure, so $A^*$ is an effectively
null set.

It remains to show how to find $J$ given $I$. The interval $I$ consists of all
sequences that start with some fixed prefix $x$, i.e., $I = x\Omega$. Since
sequences in $A^*$ have all their tails in $A$, the intersection $I \cap A^*$ is
contained in $xA$, and the latter set has measure $r\mu(I)$ (where
$r = \mu(A)$). □
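The iteration in this proof can be checked on a toy example. The sketch below assumes, purely for illustration, that $A$ is a finite union of disjoint intervals given by a prefix-free list of strings (a genuine effectively open set is only enumerated, not finite). The helper names `measure`, `is_prefix_free` and `refine` are hypothetical. Replacing every interval $x\Omega$ by $xA$ amounts to concatenating generators, so the $k$-fold concatenations generate a cover of $A^*$ of measure $r^k$.

```python
from fractions import Fraction
from itertools import product

def measure(prefixes):
    """Uniform measure of the union of the intervals x*Omega.

    Assumes `prefixes` is prefix-free, so the intervals are disjoint.
    """
    return sum(Fraction(1, 2 ** len(x)) for x in prefixes)

def is_prefix_free(prefixes):
    """True if no string in the list is a proper prefix of another one."""
    return not any(a != b and b.startswith(a)
                   for a in prefixes for b in prefixes)

def refine(prefixes, k):
    """All concatenations of k generators: the cover after k - 1 refinements."""
    return ["".join(t) for t in product(prefixes, repeat=k)]

A = ["0", "10"]   # A = 0*Omega union 10*Omega, measure r = 3/4
r = measure(A)
for k in range(1, 5):
    print(k, measure(refine(A, k)))   # measure of the k-th cover equals r**k
```

Here $A^*$ is exactly the set of sequences with no two consecutive ones, and the covers shrink geometrically, as the proof predicts.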

Note that this proof also shows the following: suppose $A$ is an effectively
open set of measure less than 1, and $A$ can be written as a disjoint union of
intervals $A = x_1\Omega \cup x_2\Omega \cup \ldots$. Let $\omega$ be an
infinite sequence that can be written as $\omega = w_1 w_2 w_3 \ldots$ where for
all $i$, $w_i = x_j$ for some $j$. Then $\omega$ is not random. (If $A$ contains
all non-random sequences, the reverse implication is also true, and we get yet
another criterion of randomness.)

2.1 Effective Kolmogorov 0-1 law 

Trying to find characterizations of randomness similar to Kučera's theorem, one
may look at Kolmogorov's 0-1-law. It says that any measurable subset $A$ of the
Cantor space that is stable under finite changes of bits (i.e., if
$\omega \in A$ and $\omega'$ is equal to $\omega$ up to a finite change of bits,
then $\omega' \in A$) has measure 0 or 1. It can be reformulated as follows: let
$A$ be a (measurable) set of measure less than 1. Consider the set $A^*$ defined
as follows: $\omega \in A^*$ if and only if all sequences that are obtained from
$\omega$ by changing finitely many terms belong to $A$. Then $A^*$ has measure
zero (indeed, $A^*$ is stable under finite changes and cannot have measure 1).
Note also that we may assume without loss of generality that $A$ is open
(replacing it by an open cover of measure less than 1).

A natural effective version of Kolmogorov's 0-1-law can then be formulated
as follows. (In fact, this statement was considered and proved by Kučera but
was not explicitly mentioned in [Kuc85].)

Theorem 2. Let A be an effectively open set of measure r < 1. Consider the 
set A* of all sequences that belong to A and remain in A after changing finitely 
many terms. Then A* is an effectively null set. 

(As we have seen, the last two sentences can be replaced by the following claim:
any Martin-Löf random sequence can be moved outside $A$ by changing finitely
many terms.)

Proof. To prove this effective version of the 0-1-law, consider any interval
$I$. As before, we want to find an effectively open set $U \subset I$ that
contains $A^* \cap I$ and has measure at most $r\mu(I)$. Let $x$ be the prefix
that defines $I$, i.e., $I = x\Omega$. For every string $y$ of the same length
as $x$, consider the set $A_y = \{\omega \mid y\omega \in A\}$. It is easy to
see that the average measure of $A_y$ (over all $y$ of a given length) equals
$\mu(A) = r$. Therefore, the set $B = \bigcap_y A_y$ (which is effectively open
as an intersection of an effectively defined finite family of open sets) has
measure at most $r$. Now take $U = xB$. Let us show that $U$ is as wanted.
First, $U$ is an effectively open set, contained in $I$, and of measure at most
$r\mu(I)$. Also, it contains every element of $A^* \cap I$. Indeed, if
$\alpha \in A^* \cap I$, then $x$ is a prefix of $\alpha$, so one can write
$\alpha = x\beta$. Since $\alpha \in A^*$, any finite variation of $\alpha$ is
in $A$, so for all $y$ of the same length as $x$, $y\beta \in A$. Therefore,
$\beta$ is in all $A_y$, and therefore is in $B$. Since $\alpha = x\beta$, it
follows that $\alpha$ is in $xB = U$. □
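The averaging step of this proof can be made concrete in a finite toy setting. The sketch below assumes that $A$ is a union of cylinders given by binary strings of one common length $L$, and that $|x| = m \le L$. The function names `section`, `common_section` and `measure` are hypothetical. It computes the sets $A_y$ and their intersection $B$, whose measure cannot exceed $r$ because the minimum of the measures of the $A_y$ is at most their average.

```python
from fractions import Fraction
from itertools import product

def section(A, y):
    """A_y = {w : yw in A}, for A given as a set of strings of equal length."""
    return {s[len(y):] for s in A if s.startswith(y)}

def common_section(A, m):
    """B = intersection of A_y over all binary strings y of length m >= 1."""
    sections = [section(A, "".join(y)) for y in product("01", repeat=m)]
    return set.intersection(*sections)

def measure(cylinders, length):
    """Measure of a union of distinct cylinders of the given common length."""
    return Fraction(len(cylinders), 2 ** length)

A = {"00", "01", "10"}     # cylinders of length L = 2, measure r = 3/4
L, m = 2, 1
B = common_section(A, m)   # strings w such that both 0w and 1w are in A
print(B, measure(B, L - m))
```

In this example $A_0 = \{0, 1\}$ and $A_1 = \{0\}$, so $B = \{0\}$ has measure $1/2 \le 3/4$.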

2.2 Adding prefixes 

We have considered left shifts (deletion of prefixes) and finite changes. Another 
natural transformation is the addition of finite prefixes. It turns out that a similar 
result can be proven in this case (although the proof becomes a bit more difficult). 

Theorem 3. Let $A$ be an effectively open set of measure $r < 1$. Let $A^*$ be
the set of all sequences $\omega$ such that $x\omega \in A$ for every binary
string $x$. Then $A^*$ is an effectively null set. In other words, for every
Martin-Löf random sequence $\omega$ there exists a string $x$ such that
$x\omega \notin A$.

Proof. To prove this statement, consider again some interval $I = x\Omega$. We
want to cover $A^* \cap I$ by an effectively open set of measure $r\mu(I)$. (In
fact, we get a cover of measure $s\mu(I)$ for some constant $s \in (r, 1)$, but
this is enough.) Consider some string $z$. We know that the density of $A^*$ in
$I$ does not exceed the density of $A$ in $zI = zx\Omega$. Indeed,
$x\omega \in A^*$ implies $zx\omega \in A$ by definition of $A^*$.

Moreover, for any finite number of strings $z_1, \ldots, z_k$ the set $A^*$ is
contained in the intersection of sets $\{\omega \mid z_i\omega \in A\}$, and the
density of $A^*$ in $I$ is bounded by the minimal (over $i$) density of $A$ in
$z_iI = z_ix\Omega$.

Now let us choose $z_1, \ldots, z_k$ in such a way that the intervals
$z_ix\Omega$ are disjoint and cover $\Omega$ except for a set of small measure.
This is possible for the same reason as in a classic argument that explains why
the Cantor set in $[0, 1]$ has zero measure. We start, say, with
$z_1 = \Lambda$ (the empty string) and get the first interval $x\Omega$. The
rest of $\Omega$ can be represented as a union of disjoint intervals, and inside
each interval $u\Omega$ we select a subinterval $ux\Omega$, thus multiplying the
size of the remaining set by $(1 - 2^{-|x|})$. Since this procedure can be
iterated indefinitely, we can make the rest as small as needed.

Then we note that the density of $A$ in the union of disjoint intervals (and
this density is close to $r$ if the union covers $\Omega$ almost entirely) is
greater than or equal to the density of $A$ in one of the intervals, so the
intersection (an effectively open set) has density at most $s$ for some constant
$s \in (r, 1)$, as we have claimed. (We need to use the intersection and not
only one of the sets since our construction should be effective even when we do
not know for which interval the density is minimal.) □
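The choice of the strings $z_1, \ldots, z_k$ in the proof of Theorem 3 can be sketched as follows. The function names `cover_step` and `build_cover` are hypothetical. In each round every remaining interval $u\Omega$ contributes $z = u$ (so that $zx\Omega = ux\Omega$ is selected), and the rest of $u\Omega$ is split into the disjoint intervals of sequences that first deviate from $x$ at a given bit.

```python
from fractions import Fraction

def flip(bit):
    return "1" if bit == "0" else "0"

def cover_step(remaining, x):
    """One round: select u*x*Omega inside each remaining interval u*Omega."""
    selected, rest = [], []
    for u in remaining:
        selected.append(u)   # z = u, selected interval is z*x*Omega
        for i in range(len(x)):
            # sequences in u*Omega that agree with x on i bits, then deviate
            rest.append(u + x[:i] + flip(x[i]))
    return selected, rest

def build_cover(x, rounds):
    """Return the strings z_1, ..., z_k and the generators of the uncovered set."""
    zs, remaining = [], [""]
    for _ in range(rounds):
        selected, remaining = cover_step(remaining, x)
        zs.extend(selected)
    return zs, remaining

def measure(prefixes):
    """Uniform measure of a union of disjoint intervals."""
    return sum(Fraction(1, 2 ** len(p)) for p in prefixes)

x = "01"
zs, remaining = build_cover(x, 3)
print(zs[0] == "", measure(remaining))   # uncovered part: (1 - 2**-2)**3
```

After `rounds` iterations the uncovered part has measure $(1 - 2^{-|x|})^{\text{rounds}}$, which tends to zero as in the Cantor set argument.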



2.3 Bidirectional sequences and shifts 

Recall the initial discussion in terms of ergodic theory. In this setting it is
more natural to consider bi-infinite binary sequences, i.e., mappings of type
$\mathbb{Z} \to \mathbb{B} = \{0, 1\}$; the uniform measure $\mu$ can be
naturally defined on this space, too. On this space the transformation $T$
corresponding to the shift to the left is reversible: any sequence can be
shifted left or right.

The result of Theorem 1 remains true in this setting.

Theorem 4. Let $A$ be an effectively open set of $\mathbb{B}^{\mathbb{Z}}$ of
measure $r < 1$. The set $A^*$ of all sequences that remain in $A$ after any
arbitrary shift (any distance in any direction) is an effectively null set.

To prove this statement, consider any $s \in (r, 1)$. As usual, it is enough to
find (effectively) for every interval $I_x$ an effectively open subset of $I_x$
that contains $A^* \cap I_x$ and has measure at most $s\mu(I_x)$. Here $x$ is a
finite partial function from $\mathbb{Z}$ to $\mathbb{B}$ and $I_x$ is the set
of all its extensions. (One may assume that $x$ is contiguous, since every other
interval is a finite union of disjoint contiguous intervals, but this is not
important for us.) Then we may iterate this construction, replacing each
interval of an effectively open set by an open set inside this interval, and so
on until the total measure ($s^k$, where $k$ is the number of iterations)
becomes smaller than any given $\varepsilon > 0$.

Assume that some $I_x$ is given. Note that $A^*$ is covered by every shift of
$A$, so any intersection of $I_x$ with a finite collection of shifted versions
of $A$ (i.e., sets of type $T^n(A)$ for $n \in \mathbb{Z}$) is a cover for
$I_x \cap A^*$. It remains to show that the intersection of properly chosen
shifts of $A$ has density at most $s$ inside $I_x$. To estimate the measure of
the intersection, it is enough to consider the minimum of measures, and the
minimum can be estimated by estimating the average measure.

More formally, we first note that by reversibility of the shift and the
invariance of the measure, we have

$$\mu(I_x \cap T^{-n}(A)) = \mu(A \cap T^{n}(I_x))$$

for all $n$. Then we prove the following lemma:

Lemma 5. Let $J_1, \ldots, J_k$ be independent intervals of the same measure $d$
corresponding to disjoint functions $x_1, \ldots, x_k$ of the same length. Then
the average of the numbers

$$\mu(A \cap J_1), \ldots, \mu(A \cap J_k)$$

does not exceed $sd$ if $k$ is large enough. Moreover such a $k$ can be found
effectively.
Proof (of Lemma 5). The average equals

$$\frac{1}{k} \sum_i E(\chi_A \cdot \chi_i)$$

where $\chi_A$ is the indicator function of $A$ and $\chi_i$ is the indicator
function of $J_i$. Rewrite this as

$$E\Bigl(\chi_A \cdot \frac{1}{k} \sum_i \chi_i\Bigr)$$

and note that

$$\frac{1}{k} \sum_i \chi_i$$

is the frequency of successes in $k$ independent trials with individual
probability $d$. (Since the functions $x_i$ are disjoint, the corresponding
intervals $J_i$ are independent events.) This frequency (as a function on the
bi-infinite Cantor space $\mathbb{B}^{\mathbb{Z}}$) is close to $d$ everywhere
except for a set of small measure (by the central limit theorem; in fact
Chebyshev's inequality is enough). The discrepancy and the measure of this
exceptional set can be made as small as needed using a large $k$, and the
difference is then covered by the gap between $r$ and $s$. This ends the proof
of the lemma.

Now, given an interval $I_x$, we cover $I_x \cap A^*$ as follows. First, we take
an integer $N$ larger than the size of the interval $I_x$. The intervals

$$T^{N}(I_x), T^{2N}(I_x), T^{3N}(I_x), \ldots$$

are independent and have the same measure as $I_x$, so we can apply the above
lemma and effectively find a $k$ such that the average of

$$\mu(A \cap T^{N}(I_x)), \ldots, \mu(A \cap T^{kN}(I_x))$$

does not exceed $s\mu(I_x)$. This means that for some $i \le k$ one has

$$\mu(I_x \cap T^{-iN}(A)) = \mu(A \cap T^{iN}(I_x)) \le s\mu(I_x).$$

Therefore, $I_x \cap \bigcap_{i \le k} T^{-iN}(A)$ is an effectively open cover
of $I_x \cap A^*$ of measure at most $s\mu(I_x)$. □



The statement can be strengthened: we can replace all shifts by any infinite 
enumerable family of shifts. 

Theorem 6. Let $A$ be an effectively open set (of bi-infinite sequences) of
measure less than 1. Let $S$ be a computably enumerable infinite set of
integers. Then the set

$$A^* = \{\omega \mid \omega \text{ remains in } A \text{ after shift by } s, \text{ for every } s \in S\}$$

is an effectively null set.

(Reformulation: let $A$ be an effectively open set of measure less than 1; let
$S$ be an infinite computably enumerable set of integers; let $\omega$ be a
Martin-Löf random bi-infinite sequence. Then there exists $s \in S$ such that
the $s$-shift of $\omega$ is not in $A$.)

Proof. The proof remains the same: indeed, having infinitely many shifts, we 
can choose as many disjoint shifts of a given interval as we want. □ 

The argument used to prove Theorem 4 (and Theorem 6) is more complicated
than the previous ones (that do not refer to the central limit theorem):
previously we were able to use disjoint intervals instead of independent ones.
In fact the results about shifts in unidirectional sequences (both) are
corollaries of the last statement. Indeed, let $A$ be an effectively open set of
right-infinite sequences of measure less than 1. Let $\omega$ be a
right-infinite Martin-Löf random sequence. Then it is a part of a bi-infinite
random sequence $\bar\omega$ (one may use, e.g., van Lambalgen's theorem [vL87]
on the random pairs, see Section 4 for a precise statement). So there is a right
shift that moves $\bar\omega$ outside $\bar A$, and also a left shift with the
same property (here by $\bar A$ we denote the set of bi-infinite sequences whose
right halves belong to $A$).

3 A generalization to all ergodic transformations 

3.1 Generalizing Kučera's theorem

First let us recall the notion of a computable transformation of the Cantor
space $\Omega$. Consider a Turing machine with a read-only input tape and a
write-only output tape (where the head prints a bit and moves to the next blank
position). Such a machine determines a computable mapping of $\Omega$ into the
space of all finite and infinite binary sequences. Restricting this mapping to
the inputs where the output sequence is infinite, we get a (partial) computable
mapping from $\Omega$ into $\Omega$.

Theorem 7. Let $\mu$ be a computable measure on $\Omega$. Let
$T\colon \Omega \to \Omega$ be a partial computable, almost everywhere defined,
measure-preserving, ergodic transformation of $\Omega$. Let $A$ be an
effectively open subset of $\Omega$ of measure less than 1. Let $A^*$ be the set
of points $x \in \Omega$ such that $T^n(x) \in A$ for all $n \ge 0$. Then $A^*$
is an effectively null set.

Proof. Let $r$ be a real number such that $\mu(A) < r < 1$. As before, given an
interval $I$, we want to (effectively) find an $n$ such that
$I \cap \bigcap_{i \le n} T^{-i}(A)$ has measure at most $r\mu(I)$. This gives
us an effectively open cover of $A^* \cap I$ having measure at most $r\mu(I)$;
iterating this process, we conclude that $A^*$ is an effectively null set.

(A technical clarification is needed here. If we consider $T$ only on inputs
where the output sequence is infinite, the set $T^{-1}(A)$ (and in general
$T^{-i}(A)$) may no longer be open in $\Omega$. But since $T$ is almost
everywhere defined, we may extend $T$ to the space of infinite and finite
sequences in a natural way and get an effectively open cover of the same
measure.)

To estimate $\mu(I \cap \bigcap_{i \le n} T^{-i}(A))$, we note that it does not
exceed the minimal value of $\mu(I \cap T^{-i}(A))$, which in its turn does not
exceed the average (over $i \le n$) of $\mu(I \cap T^{-i}(A))$. This average,

$$\frac{1}{n+1} \bigl[\mu(I \cap A) + \mu(I \cap T^{-1}(A)) + \ldots + \mu(I \cap T^{-n}(A))\bigr] \qquad (*)$$

can be rewritten as

$$\frac{1}{n+1} \bigl[\mu(T^{-n}(I) \cap T^{-n}(A)) + \mu(T^{-(n-1)}(I) \cap T^{-n}(A)) + \ldots + \mu(I \cap T^{-n}(A))\bigr]$$

since $T$ is measure preserving. The latter expression is the inner product of
the indicator function of $T^{-n}(A)$ and the average
$a_n = (\chi_0 + \ldots + \chi_n)/(n+1)$, where $\chi_i$ is the indicator
function of $T^{-i}(I)$.



As $n \to \infty$, the average $a_n$ converges in $L_2$ to the constant function
$\mu(I)$, due to von Neumann's mean ergodic theorem. By the Cauchy-Schwarz
inequality, this implies that the scalar product converges to $\mu(A)\mu(I)$ and
therefore does not exceed $r\mu(I)$ for $n$ large enough.

It remains to (effectively) find a value of $n$ for which the $L_2$-distance
between $a_n$ and the constant $\mu(I)$ is small. Note that for all $i$ the set
$T^{-i}(I)$ is an effectively open set of measure $\mu(I)$ (recall that $T$ is
measure preserving), and $\mu(I)$ is computable since $\mu$ is a computable
measure. Therefore, for any $i$ and $\varepsilon > 0$, one can uniformly
approximate $T^{-i}(I)$ by its subset $U$ that is a finite union of intervals
such that $\mu(T^{-i}(I) \setminus U) < \varepsilon$. This means that the
$L_2$-distance between $a_n$ and the constant function $\mu(I)$ can be computed
effectively, and we can wait until we find a term with any precision needed. In
particular, we can effectively find an $n$ such that the average (*) is less
than $r\mu(I)$. By the above discussion, we then have
$\mu(I \cap \bigcap_{i \le n} T^{-i}(A)) < r\mu(I)$, as needed. □

Now we get all the theorems of Section 2 (except for Theorem 6) as corollaries:
the effective ergodic theorem for the bidirectional shift (Theorem 4)
immediately follows as the bidirectional shift is clearly computable,
measure-preserving and ergodic. Remark: technically we proved Theorem 7 only for
the Cantor space $\Omega$, but the space of functions
$\mathbb{Z} \to \mathbb{B}$ on which the bidirectional shift is defined is
computably isomorphic to $\Omega$. By this we mean that there exists a
computable measure preserving bijection from one space to another; for example,
one could represent a two-directional sequence
$\ldots\omega(-2)\omega(-1)\omega(0)\omega(1)\omega(2)\ldots$ by a
one-directional sequence $\omega(0)\omega(-1)\omega(1)\omega(-2)\omega(2)\ldots$,
and under this representation we can therefore represent the bidirectional shift
as a measure preserving map from $\Omega$ to itself.
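The interleaving used in this remark is just a bijection between integer positions and natural-number positions. A minimal sketch follows; the function names `z_to_n`, `n_to_z` and `fold` are hypothetical, and a bi-infinite sequence is modeled as a Python function from integers to bits.

```python
def z_to_n(i):
    """Position of w(i) in the one-directional sequence w(0) w(-1) w(1) ..."""
    return 2 * i if i >= 0 else -2 * i - 1

def n_to_z(k):
    """Inverse map: which integer index sits at position k."""
    return k // 2 if k % 2 == 0 else -(k + 1) // 2

def fold(two_sided, radius):
    """Prefix w(0) w(-1) w(1) ... w(-radius) w(radius) of the folded sequence."""
    return [two_sided(n_to_z(k)) for k in range(2 * radius + 1)]

# Example: the bi-infinite sequence with w(i) = 1 exactly for negative i.
print([n_to_z(k) for k in range(5)])
print(fold(lambda i: 1 if i < 0 else 0, 2))
```

Both maps are computable and mutually inverse, and each position of the folded sequence depends on exactly one position of the original one, which is why the uniform measure is preserved.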

Recalling the discussion in Section 2.3, we see also that one can derive both
Theorem 1 (Kučera's theorem for deletion of finite prefixes) and Theorem 3
(addition of finite prefixes) from Theorem 7.

It turns out that even Theorem 2 (finite change of bits) can be proven in this
way. Indeed, let us consider the map $F$ defined on $\Omega$ by:

$$F(1^n 0 \omega) = 0^n 1 \omega \text{ for all } n, \text{ and } F(11111\ldots) = 00000\ldots$$

($F$ adds 1 to the sequence in the dyadic sense). It is clear that $F$ is
computable and measure-preserving. That it is ergodic comes from Kolmogorov's
0-1 law, together with the observation that any two binary sequences $\omega$,
$\omega'$ that agree on all but finitely many bits are in the same orbit:
$\omega' = F^n(\omega)$ for some $n \in \mathbb{Z}$. The reverse is also true
except for the case when sequences have finitely many zeros or finitely many
ones. This cannot happen for a random sequence, so this exceptional case does
not prevent us from deriving Theorem 2 from Theorem 7.
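The action of $F$ on a finite prefix is ordinary binary increment with the least significant bit first. The following sketch uses the hypothetical name `add_one` and represents a prefix as a list of bits. If the prefix consists of ones only, the carry leaves the prefix, so the output prefix is all zeros whatever the rest of the sequence is.

```python
def add_one(bits):
    """Apply F to a finite prefix: turn leading 1s into 0s, the first 0 into 1."""
    out = list(bits)
    for i, b in enumerate(out):
        if b == 1:
            out[i] = 0      # carry propagates through a leading 1
        else:
            out[i] = 1      # first 0 absorbs the carry
            break
    return out

print(add_one([1, 1, 0, 0, 1]))   # F(1^2 0 w) = 0^2 1 w with w = 01

# The orbit of a length-3 prefix runs through all 8 prefixes of that length.
orbit, p = [], [0, 0, 0]
for _ in range(8):
    orbit.append(tuple(p))
    p = add_one(p)
```

That every prefix of a given length is visited exactly once per cycle reflects the fact that $F$ preserves the uniform measure.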

Remark 1. Theorem 6 asserts that given a random $\omega$ and an effectively open
set $U$ of measure less than 1, there exists an $n$ such that
$T^n(\omega) \notin U$ (where $T$ is the shift in the space of bidirectional
sequences), and that moreover $n$ can be found in a computably enumerable set
fixed in advance. This of course still holds for the unidirectional shift on
$\Omega$, but this does not hold for all ergodic maps. Indeed, this fact is
related to the so-called strong mixing property of the shift, which not all
ergodic maps have. For example, a rotation of the circle by a computable
irrational angle $\alpha$ (i.e., a mapping $x \mapsto x + \alpha \bmod 1$ on
$\Omega$ seen as the interval $[0, 1]$) is a computable ergodic map that does
not have this property, and it is easy to construct a counterexample to the
claim of Theorem 6 for that particular map.

3.2 An effective version of Birkhoff's ergodic theorem

The generalization of Kučera's theorem we proved in the previous section
(Theorem 7) is only a weak form of ergodic theorem. It asserts that under the
action of a computable ergodic map, the orbit of a Martin-Löf random point will
intersect any given effectively closed set of positive measure, but it does not
say anything about the frequency. This is what we achieve with the next theorem.

Theorem 8. Let $\mu$ be a computable measure on $\Omega$. Let
$T\colon \Omega \to \Omega$ be a computable almost everywhere defined
$\mu$-preserving ergodic transformation. Let $U$ be an effectively open set. For
every Martin-Löf random point $\omega$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} \chi_U(T^k(\omega)) = \mu(U).$$

Note that the statement is symmetric, so the same is true for an effectively
closed set $C$.

Proof. Let $g_n(\omega) = \frac{1}{n} \sum_{k=0}^{n-1} \chi_U(T^k(\omega))$ be
the frequency of $U$-elements among the first $n$ iterations of $\omega$. Let us
first prove that $\limsup g_n(\omega) \le \mu(U)$. Then we show (see part (2)
below) that $\liminf g_n(\omega) \ge \mu(U)$.

(1) Let $r > \mu(U)$ be some rational number and let

$$G_N = \{\omega : (\exists n \ge N)\ g_n(\omega) > r\}$$

be the set of points where some far enough frequency (average of at least $N$
terms) exceeds $r$. The set $G_N$ is an effectively open set; indeed, the
functions $g_n$ are lower semicomputable (uniformly in $n$), hence the condition
$g_n(\omega) > r$ is enumerable. The sets $G_N$ form a decreasing sequence. We
know by the classical Birkhoff's pointwise ergodic theorem that
$\mu(\bigcap_N G_N) = 0$, since the sequence of functions $g_n$ converges to
$\mu(U) < r$ $\mu$-almost everywhere. As a result, there exists $N$ such that
$\mu(G_N) < 1$. We can thus apply Theorem 7 to this $G_N$ and conclude that for
every Martin-Löf random $\omega$ there exists $k$ such that
$T^k(\omega) \notin G_N$. Hence $\limsup_n g_n(T^k(\omega)) \le r$. Since a
finite number of iterations does not change the limsup, we conclude that
$\limsup g_n(\omega) \le r$. The number $r$ was an arbitrary rational number
greater than $\mu(U)$, so $\limsup g_n(\omega) \le \mu(U)$.

(2) We now prove that $\liminf g_n(\omega) \ge \mu(U)$. This in fact can be
deduced from the first part of the proof. The set $U$ is open, so it is a
countable union of disjoint intervals. Taking a finite part of this countable
union, we get an effectively closed set $U' \subset U$ and can apply the
previous statement to its complement. It says that the orbit of a Martin-Löf
random point $\omega$ will be in $U'$ with frequency at least $\mu(U')$ (the
upper bound for the complement of $U'$ means a lower bound for $U'$). Since
$\mu(U')$ can be arbitrarily close to $\mu(U)$, we conclude that
$\liminf g_n(\omega) \ge \mu(U)$. □

Remark 2. The inequality $\liminf g_n(\omega) \ge \mu(U)$ can actually be
derived from the algorithmic version of Birkhoff's theorem proved by V'yugin
[Vyu97], since $U'$ is both an open and a closed set, but it is easier to refer
to the first part of the proof. Note also that in this direction we do not need
effectivity: the inequality holds for every open set $U$ and every Martin-Löf
random point $\omega$. Of course the other inequality generally fails for
(non-effectively) open sets: indeed, the orbit of every point $\omega$ can be
enclosed in a (non-effectively) open set of small measure.

Theorem 8 extends to a larger class of sets in a straightforward way. We say
that a set $A$ is effectively $\mu$-approximable if
$\mu(A) = \sup\{\mu(F) : F \text{ effectively closed and } F \subset A\} = \inf\{\mu(G) : G \text{ effectively open and } A \subset G\}$.
For instance, any $\Delta^0_2$-set is effectively $\mu$-approximable.

Corollary 1. Let $X \subset \Omega$ be an effectively $\mu$-approximable set.
For every Martin-Löf $\mu$-random $\omega$,
$\lim_n \frac{1}{n} (\chi_X(\omega) + \ldots + \chi_X(T^{n-1}(\omega))) = \mu(X)$.

Proof. For every $\varepsilon > 0$ we can apply Theorem 8 to the upper and lower
$\varepsilon$-approximations of $X$; the frequency for $X$ is between them. □

Theorem 8 can also be extended to a wider class of functions than characteristic
functions of sets.

Theorem 9. Let $f\colon \Omega \to [0, +\infty]$ be lower semicomputable. For
every Martin-Löf random $\omega$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=0}^{n-1} f(T^k(\omega)) = \int f\,d\mu.$$

Note that we allow the integral to be infinite; in this case the sequence in the
left-hand side has limit $+\infty$.

Proof. Let $f$ be a lower semicomputable function with a finite integral. Let
$f_n = \frac{1}{n} (f + \ldots + f \circ T^{n-1})$. Let $r > \int f\,d\mu$ be a
rational number and

$$G_N = \{\omega : (\exists n \ge N)\ f_n(\omega) > r\}.$$

The set $G_N$ is an effectively open set and $\mu(\bigcap_N G_N) = 0$ as
$f_n(\omega) \to \int f\,d\mu < r$ for $\mu$-almost every $\omega$ (by the
classical version of Birkhoff's ergodic theorem). As a result, there exists $N$
such that $\mu(G_N) < 1$. By Theorem 7, if $\omega$ is Martin-Löf random then
there exists $k$ such that $T^k(\omega) \notin G_N$. Hence
$\limsup f_n(T^k(\omega)) \le r$, and
$\limsup f_n(\omega) = \limsup f_n(T^k(\omega)) \le r$. Since
$r > \int f\,d\mu$ can be arbitrarily close to the integral, we have that
$\limsup f_n(\omega) \le \int f\,d\mu$.

It remains to prove that $\liminf f_n(\omega) \ge \int f\,d\mu$. This is true
for every lower semicontinuous $f$. Indeed, consider some lower bound for $f$
that is a basic function (a linear combination of indicators of intervals). For
these basic functions the statement of the theorem is true (as we already know),
and their integrals can be arbitrarily close to $\int f\,d\mu$. (This argument
works also for the case $\int f\,d\mu = +\infty$.) □

Theorem 9 is, to the extent of our knowledge, the strongest form of effective
ergodic theorem proven so far, in the case of an ergodic transformation. In
particular, it strengthens the results that appeared in [Vyu97,Nan08,HR09b] for
ergodic measures. We will see in the next section that it can even be extended
a bit further, namely to other spaces than $\Omega$ and to ergodic maps that are
only "weakly computable" (in a sense which we will explain below). However,
whether the Birkhoff averages of an effectively open set converge at all
Martin-Löf random points when the measure is not ergodic remains an open problem
(note that in the non-ergodic case, if the limit exists at a point, that limit
is no longer the measure of the open set but depends on the particular point).

But let us mention first an interesting consequence of Theorem 9. Recall that
the randomness deficiency of a sequence $\omega$ is defined as

$$d_\mu(\omega) = \sup_n \{-\log \mu[\omega_0 \ldots \omega_{n-1}] - K(\omega_0 \ldots \omega_{n-1})\}$$

where $K(w)$ is the (prefix) Kolmogorov complexity of $w$.

The following was proven by Gács [Gac80]: a sequence $\omega$ is Martin-Löf
random with respect to $\mu$ if and only if $d_\mu(\omega)$ is finite. Moreover,
$t_\mu := 2^{d_\mu}$ is a universal randomness test in the sense that it is
lower semicomputable, $\mu$-integrable, and for every lower semicomputable
$\mu$-integrable $f\colon \Omega \to [0, +\infty]$ there exists $c$ such that
$f \le c\,t_\mu$.

For a computable $\mu$-preserving mapping $T$ it is already known that if
$\omega$ is Martin-Löf random, then so are $T(\omega)$, $T^2(\omega)$, etc.
Theorem 9 applied to $t_\mu$ yields a stronger result for the case of ergodic
$T$: not only are the values $t_\mu(\omega)$, $t_\mu(T(\omega))$,
$t_\mu(T^2(\omega))$, etc. finite, but their average is also bounded. In this
sense, the iterates of a random point are "random in the average". It is an open
problem whether this still holds in the non-ergodic case.



3.3 A final generalization: computable probability spaces 
and layerwise computable functions 

We now briefly present two "orthogonal" ways in which the previous results 
can be extended to other contexts. On the one hand, the algorithmic theory of 
randomness has been extended from the Cantor space to any computable metric 
space, where the computability of probability measures is now well understood. 
All the results presented above extend to such spaces. On the other hand, on
the Cantor space as well as any computable metric space, the computability
assumption on the mapping T can be weakened into layerwise computability
introduced in [HR09a]. Intuitively, this weakening corresponds in analysis to
replacing continuity with measurability.

The first generalization can be carried out in two ways: the proof on the 
Cantor space can generally be adapted to any computable probability space, or 
the isomorphism between such spaces (see [HR09c]) can be used to transfer the 
result without proving it again. The second generalization is also rather direct: 
replacing computability notions with their "layerwise" counterparts generally 
leaves the proofs correct. Caution is sometimes needed and appropriate lemmas 
then have to be used (especially regarding composition of functions). 

We now give a brief overview of the aforementioned concepts. More details 
can be found in [Gac05,HR09c,HR09b,BGH+]. 

The algorithmic theory of randomness has been extended from the Cantor
space to any computable metric space, i.e. any separable metric space with a
distinguished dense countable set on which the metric is computable. A
computable probability space is such a space $X$, endowed with a computable
Borel probability measure $\mu$. A universal Martin-Löf test always exists on
such spaces, and induces a canonical decomposition of the set of Martin-Löf
random points $\mathcal{R}_\mu = \bigcup_n \mathcal{R}^\mu_n$ with
$\mathcal{R}^\mu_n \subseteq \mathcal{R}^\mu_{n+1}$ and
$\mu(\mathcal{R}^\mu_n) > 1 - 2^{-n}$ (namely, $\mathcal{R}^\mu_n$ is the
complement in $X$ of the $n$-th level of a universal $\mu$-Martin-Löf test).
Using this decomposition, one can weaken many computability notions, starting
with the notion of a computable function: we say that a function $f : X \to Y$
(where $Y$ is a computable metric space) is $\mu$-layerwise computable if it is
computable on each $\mathcal{R}^\mu_n$ (uniformly in $n$)$^7$. Such a function
may be discontinuous, but is still continuous on each $\mathcal{R}^\mu_n$,
which is a totally disconnected set. It turns out that this notion admits a
characterization in terms of effective measure theory.

Observe that $\mu$-layerwise computability of real-valued functions is closed
under basic operations such as sum, product, multiplication by a computable
real number, and absolute value. Composition does not automatically preserve
layerwise computability without an assumption on the preservation of the
measure. If $f : X \to [-\infty, +\infty]$ and $T : X \to X$ are
$\mu$-layerwise computable and $T$ preserves $\mu$, then $f \circ T$ is
$\mu$-layerwise computable. If, moreover, $f$ is bounded, then $\int f\,d\mu$
is computable, uniformly in $f$ and a bound on $f$. In particular, $\|f\|_1$
and $\|f\|_2$ are computable.

The main reason why layerwise computability fits well with Martin-Löf
randomness is that Martin-Löf random points pass a class of tests that is
wider than the usual Martin-Löf tests: the tests that, on each
$\mathcal{R}^\mu_n$, "look like" Martin-Löf tests.

Lemma 10. Let $A_n \subseteq X$ be such that there exist uniformly effectively
open sets $U_{n,k}$ such that
$A_n \cap \mathcal{R}^\mu_k = U_{n,k} \cap \mathcal{R}^\mu_k$. If
$\mu(A_n) < 2^{-n}$ for all $n$, then every $\mu$-random point is outside
$\bigcap_n A_n$. Moreover there is $c$ such that
$\mathcal{R}^\mu_n \cap A_{n+c} = \emptyset$ for all $n$.



7 When $X = Y = \Omega$, it means that there is a Turing machine that on input
$n$ and oracle $x \in \mathcal{R}^\mu_n$ progressively writes $f(x)$ on the
output tape. The machine does not need to behave well when
$x \notin \mathcal{R}^\mu_n$.



Proof. Let $V_n = U_{n,n} \cup (X \setminus \mathcal{R}^\mu_n)$: $V_n$ is a
Martin-Löf test and $A_n \subseteq V_n$.
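The two claims in this proof can be checked in one line each. The following is a sketch, assuming (as in the definition of the layers) that $X \setminus \mathcal{R}^\mu_n$ is the $n$-th level of the universal test, hence effectively open and of measure less than $2^{-n}$:

```latex
\begin{align*}
V_n &= U_{n,n} \cup (X \setminus \mathcal{R}^\mu_n)
     = (A_n \cap \mathcal{R}^\mu_n) \cup (X \setminus \mathcal{R}^\mu_n)
     \supseteq A_n, \\
\mu(V_n) &\le \mu(A_n) + \mu(X \setminus \mathcal{R}^\mu_n)
     < 2^{-n} + 2^{-n} = 2^{-(n-1)}.
\end{align*}
```

Under this reading the sets $V_n$ are uniformly effectively open, and $(V_{n+1})_n$ satisfies the usual bound $\mu(V_{n+1}) < 2^{-n}$, so the shift of index by one is harmless.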

Let us show how to adapt a part of the proof of Theorem 7 to computable
probability spaces and $\mu$-layerwise computable mappings.

Theorem 11. Let $(X, \mu)$ be a computable probability space. Let
$T : X \to X$ be a $\mu$-layerwise computable, measure-preserving, ergodic
transformation of $X$. Let $A$ be an effectively open subset of $X$ of measure
less than 1. For every $\mu$-random point $x$, there exists $n$ such that
$T^n(x) \notin A$.

Proof (Sketch). The proof is essentially the same as that of Theorem 7. The only 
differences are: adapting the notion of cylinder; using properties of layerwise 
computability; using Lemma 10. 

A computable probability space always admits a basis of metric balls with
computable centers and radii, whose borders have null measure. These balls
correspond in a sense to the cylinders of the Cantor space: for instance, their
measures are computable. Let then $B = B(x, r)$ be a metric ball with
computable center and radius, such that $\mu(\{y : d(x, y) = r\}) = 0$. Then
$\mu(B)$ is computable, $\chi_B$ is $\mu$-layerwise computable and for all $n$
the function $f_n := \frac{1}{n} \sum_{k=0}^{n-1} \chi_B \circ T^k$ is
$\mu$-layerwise computable, uniformly in $n$. As a result, the $L^2$-norms of
the functions $f_n - \mu(B)$ are all uniformly computable. Hence we can
effectively find $n$ such that
$\mu(B \cap \bigcap_{i<n} T^{-i}(A)) < r\,\mu(B)$.

In the proof of Theorem 7, the computability of $T$ implied that the set
$B \cap \bigcap_{i<n} T^{-i}(A)$ was effectively open. When $T$ is
$\mu$-layerwise computable, the set $B \cap \bigcap_{i<n} T^{-i}(A)$ is
effectively open on every $\mathcal{R}^\mu_k$. We end up with a test as in
Lemma 10 enclosing $\bigcap_n T^{-n}(A)$, which implies the result. □

In the same way, Theorems 8 and 9 are true for computable probability spaces
and for $\mu$-layerwise computable mappings $T$. In Theorem 9, the function $f$
can be assumed to be $\mu$-layerwise lower semicomputable.



4 An application: the generalized van Lambalgen's 
theorem 

The celebrated van Lambalgen theorem [vL87] asserts that in the probability
space $\Omega^2$ (pairs of binary sequences with independent uniformly
distributed components) a pair $(\omega_0, \omega_1)$ is random if and only if
$\omega_0$ is random and $\omega_1$ is $\omega_0$-random (random relative to
the oracle $\omega_0$). This can be easily generalized to $k$-tuples: an
element $(\omega_0, \omega_1, \ldots, \omega_{k-1})$ of $\Omega^k$ is random if
and only if $\omega_0$ is random and $\omega_i$ is
$(\omega_0, \ldots, \omega_{i-1})$-random for all $i = 1, 2, \ldots, k-1$. Can
we generalize this statement to infinite sequences? Not completely: there
exists an infinite sequence $(\omega_i)_{i \in \mathbb{N}}$ such that
$\omega_0$ is random and $\omega_i$ is $(\omega_0, \ldots, \omega_{i-1})$-random
for all $i \ge 1$; nevertheless, $(\omega_i)_{i \in \mathbb{N}}$ is non-random
as an element of $\Omega^{\mathbb{N}}$. To construct such an example, take a
random sequence in $\Omega^{\mathbb{N}}$ and then replace the first $i$ bits of
$\omega_i$ by zeros.
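The construction can be sketched on finite prefixes. In the snippet below, the function name `zero_prefixes`, the 6-by-8 array size, and the use of a seeded pseudorandom generator as a stand-in for a random element of $\Omega^{\mathbb{N}}$ are all our own illustrative assumptions.

```python
import random

def zero_prefixes(rows):
    """Return a copy of rows in which the first i bits of row i are replaced by zeros."""
    return [[0] * min(i, len(row)) + row[i:] for i, row in enumerate(rows)]

rng = random.Random(1)  # pseudorandom stand-in; a true random element is not computable
rows = [[rng.randrange(2) for _ in range(8)] for _ in range(6)]  # 8-bit prefixes of 6 components
modified = zero_prefixes(rows)
```

The modified sequence fails an explicit test: the set of sequences whose $k$-th component starts with $k$ zeros is effectively open, uniformly in $k$, and has measure $2^{-k}$, and the modified sequence belongs to all of these sets.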



Informally, in this example all $\omega_i$ are random, but their "randomness
deficiency" increases with $i$, so the entire sequence $(\omega_i)$ is not
random (in $\Omega^{\mathbb{N}}$). K. Miyabe [Miy] has recently shown that one
can overcome this difficulty by allowing finitely many bit changes in each
$\omega_i$ (the number of changed bits may depend on $i$):

Theorem 12 (Miyabe). Let $(\omega_i)_{i \in \mathbb{N}}$ be a sequence of
elements of $\Omega$ such that $\omega_0$ is random and $\omega_i$ is
$(\omega_0, \ldots, \omega_{i-1})$-random for all $i \ge 1$. Then there exists
a sequence $(\omega'_i)_{i \in \mathbb{N}}$ such that

- For every $i$ the sequence $\omega'_i$ is equal to $\omega_i$ except for a
  finite number of places.

- The sequence $(\omega'_i)_{i \in \mathbb{N}}$ is a random element of
  $\Omega^{\mathbb{N}}$.

Informally, this result can be explained as follows: as we have seen
(Theorem 2), a change in finitely many places can decrease the randomness
deficiency (starting from any random sequence, we get a sequence that is not
covered by the first set of a Martin-Löf test) and therefore can prevent
"accumulation" of randomness deficiency.

This informal explanation can be formalized and works not only for finite
changes of bits but for any ergodic transformation. In fact, the results of
this paper allow us to get a short proof of the following generalization of
Miyabe's result (Miyabe's original proof used a different approach, namely
martingale characterizations of randomness). We restrict ourselves to the
uniform measure, but the same argument works for arbitrary computable measures.

Theorem 13. Let $(\omega_i)_{i \in \mathbb{N}}$ be a sequence of elements of
$\Omega$ such that $\omega_0$ is random and $\omega_i$ is
$(\omega_0, \ldots, \omega_{i-1})$-random for all $i \ge 1$. Let
$T : \Omega \to \Omega$ be a computable bijective ergodic map. Then there
exists a sequence $(\omega'_i)_{i \in \mathbb{N}}$ such that

- For every $i$, the sequence $\omega'_i$ is an element of the orbit of
  $\omega_i$ (i.e., $\omega'_i = T^{n_i}(\omega_i)$ for some integer $n_i$).

- The sequence $(\omega'_i)_{i \in \mathbb{N}}$ is a random element of
  $\Omega^{\mathbb{N}}$.

Proof. Let $U$ be the first level of a universal Martin-Löf test on
$\Omega^{\mathbb{N}}$, with $\mu(U) < 1/2$. We will ensure that the sequence
$(\omega'_i)_{i \in \mathbb{N}}$ is outside $U$, and this guarantees its
randomness.

Consider the set $V_0$ consisting of those $\alpha_0 \in \Omega$ such that the
section

$$U_{\alpha_0} = \{(\alpha_1, \alpha_2, \ldots) \mid (\alpha_0, \alpha_1, \alpha_2, \ldots) \in U\}$$

has measure greater than 2/3. The measure of $V_0$ is less than 1, otherwise we
would have $\mu(U) > 1/2$. It is easy to see that $V_0$ is an effectively open
subset of $\Omega$. Since $\omega_0$ is random, by Theorem 7 there exists an
integer $n_0$ such that $\omega'_0 = T^{n_0}(\omega_0)$ is outside $V_0$. This
$\omega'_0$ will be the first element of the sequence we are looking for.

Now we repeat the same procedure for $U_{\omega'_0}$ instead of $U$. Note that
it is an open set of measure at most 2/3, and, moreover, an effectively open
set with respect to oracle $\omega'_0$. Since $\omega_0$ and $\omega'_0$ differ
by a computable transformation, the set $U_{\omega'_0}$ is effectively open
with oracle $\omega_0$. We repeat the same argument (where 1/2 and 2/3 are
replaced by 2/3 and 3/4 respectively) and conclude that there exists an integer
$n_1$ such that the sequence $\omega'_1 = T^{n_1}(\omega_1)$ has the following
property: the set

$$U_{\omega'_0 \omega'_1} = \{(\alpha_2, \alpha_3, \ldots) \mid (\omega'_0, \omega'_1, \alpha_2, \alpha_3, \ldots) \in U\}$$

has measure at most 3/4. (Note that we need to use $\omega_0$-randomness of
$\omega_1$, since we apply Theorem 7 to an $\omega_0$-effectively open set.)

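At the next step we get $n_2$ and $\omega'_2 = T^{n_2}(\omega_2)$ such that

$$U_{\omega'_0 \omega'_1 \omega'_2} = \{(\alpha_3, \alpha_4, \ldots) \mid (\omega'_0, \omega'_1, \omega'_2, \alpha_3, \alpha_4, \ldots) \in U\}$$

has measure at most 4/5, etc.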
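The measure bounds used at each step follow from Fubini's theorem. Here is a sketch, where $W$ stands for the current open set ($U$, then $U_{\omega'_0}$, and so on), $a$ for the current bound on its measure, and $b > a$ for the threshold on the sections; the names $W$, $V$, $a$, $b$, $k$ are ours, introduced only for this illustration:

```latex
\[
V = \{\alpha : \mu(W_\alpha) > b\}, \qquad
b\,\mu(V) \le \int \mu(W_\alpha)\,d\mu(\alpha) = \mu(W) \le a
\quad\Longrightarrow\quad
\mu(V) \le \frac{a}{b} < 1 .
\]
\[
\text{At step } k \ge 0:\quad a = \frac{k+1}{k+2},\quad b = \frac{k+2}{k+3},\quad
\frac{a}{b} = \frac{(k+1)(k+3)}{(k+2)^2} = 1 - \frac{1}{(k+2)^2} .
\]
```

For $k = 0$ this gives $\mu(V_0) \le 3/4$, so the set of bad first coordinates has measure less than 1 and Theorem 7 applies to it.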

Is it possible that the resulting sequence
$(\omega'_0, \omega'_1, \omega'_2, \ldots)$ is covered by $U$? Since $U$ is
open, it would then be covered by some interval in $U$. This interval may refer
only to finitely many coordinates, so for some $m$ all sequences

$$(\omega'_0, \omega'_1, \ldots, \omega'_{m-1}, \alpha_m, \alpha_{m+1}, \ldots)$$

would belong to $U$ (for every $\alpha_m, \alpha_{m+1}, \ldots$). However, this
is impossible because our construction ensures that the measure of the set of
all $(\alpha_m, \alpha_{m+1}, \ldots)$ with this property is less than 1. □
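The greedy structure of this proof can be imitated in a finite toy model. The sketch below is an analogue under strong simplifying assumptions, not the argument itself: each coordinate ranges over `range(m)` with the uniform measure, the rotation `x -> (x + 1) % m` plays the role of the ergodic bijection $T$, and $U$ is a finite set of $k$-tuples of measure less than 1/2. The names `section_measure` and `escape` are ours.

```python
from fractions import Fraction
from itertools import product

def section_measure(U, prefix, k, m):
    """Uniform measure of {tail : prefix + tail in U} inside range(m)**(k - len(prefix))."""
    rest = k - len(prefix)
    hits = sum(1 for tail in product(range(m), repeat=rest) if prefix + tail in U)
    return Fraction(hits, m ** rest)

def escape(U, omega, m):
    """Move each coordinate along its orbit under x -> (x + 1) % m so the result avoids U."""
    k = len(omega)
    assert section_measure(U, (), k, m) < Fraction(1, 2)
    prefix, shifts = (), []
    for i, w in enumerate(omega):
        bound = Fraction(i + 2, i + 3)      # thresholds 2/3, 3/4, 4/5, ...
        for n in range(m):                  # the orbit of w visits every point
            cand = (w + n) % m
            if section_measure(U, prefix + (cand,), k, m) <= bound:
                prefix += (cand,)
                shifts.append(n)
                break
        else:
            raise AssertionError("excluded by the Fubini bound on bad coordinates")
    return prefix, shifts

m = 4
# Toy "test" of measure 20/64: first coordinate 0, or first two coordinates (1, 1).
U = {t for t in product(range(m), repeat=3) if t[0] == 0 or t[:2] == (1, 1)}
new_point, shifts = escape(U, (0, 1, 2), m)
```

After the last coordinate the section is either empty or everything, and its measure is at most $(k+1)/(k+2) < 1$, so the resulting tuple lies outside `U`. In the infinite setting this final step is replaced by the openness argument above.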

Of course, the discussion of Section 3.3 shows that Theorem 13 can be ex- 
tended to any computable probability space instead of the Cantor space, and to 
a layerwise computable ergodic map instead of a computable one. The details 
are left to the reader. 
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